Deep Structured Features for Semantic Segmentation
We propose a highly structured neural network architecture for semantic
segmentation with an extremely small model size, suitable for low-power
embedded and mobile platforms. Specifically, our architecture combines i) a
Haar wavelet-based tree-like convolutional neural network (CNN), ii) a random
layer realizing a radial basis function kernel approximation, and iii) a linear
classifier. While stages i) and ii) are completely pre-specified, only the
linear classifier is learned from data. We apply the proposed architecture to
outdoor scene and aerial image semantic segmentation and show that the accuracy
of our architecture is competitive with conventional pixel classification CNNs.
Furthermore, we demonstrate that the proposed architecture is data efficient in
the sense of matching the accuracy of pixel classification CNNs when trained on
a much smaller data set.
Comment: EUSIPCO 2017, 5 pages, 2 figures
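As a minimal sketch of stage ii), a fixed random layer can approximate an RBF kernel via random Fourier features, with no learning involved; the input dimension, bandwidth, and feature count below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.5          # RBF kernel bandwidth: k(x, y) = exp(-gamma * ||x - y||^2)
d, D = 4, 2000       # input dimension and number of random features (assumed)

# Random layer (fixed, never trained): w ~ N(0, 2*gamma*I), b ~ U[0, 2*pi]
W = rng.normal(0.0, np.sqrt(2 * gamma), size=(D, d))
b = rng.uniform(0.0, 2 * np.pi, size=D)

def phi(x):
    """Map x to random features whose inner product approximates the RBF kernel."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-gamma * np.sum((x - y) ** 2))
approx = phi(x) @ phi(y)
print(abs(exact - approx))  # shrinks as D grows
```

Only the linear classifier on top of such features would be learned from data, which is what keeps the model size small.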
Generative Adversarial Networks for Extreme Learned Image Compression
We present a learned image compression system based on GANs, operating at
extremely low bitrates. Our proposed framework combines an encoder,
decoder/generator and a multi-scale discriminator, which we train jointly for a
generative learned compression objective. The model synthesizes details it
cannot afford to store, obtaining visually pleasing results at bitrates where
previous methods fail and show strong artifacts. Furthermore, if a semantic
label map of the original image is available, our method can fully synthesize
unimportant regions in the decoded image such as streets and trees from the
label map, proportionally reducing the storage cost. A user study confirms that
for low bitrates, our approach is preferred to state-of-the-art methods, even
when they use more than double the bits.
Comment: E. Agustsson, M. Tschannen, and F. Mentzer contributed equally to
this work. ICCV 2019 camera ready version
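The "proportionally reducing the storage cost" claim can be illustrated with a toy calculation: if latents are stored only for regions the label map marks as important, the bit cost scales with the preserved area. The label map and per-region bit cost below are hypothetical stand-ins, not the paper's numbers.

```python
import numpy as np

# Hypothetical semantic label map: 0 = synthesizable (e.g. streets, trees), 1 = keep.
labels = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [1, 0, 0, 0],
                   [1, 0, 0, 0]])

bits_per_region = 64                              # assumed fixed cost per stored latent block
full_cost = labels.size * bits_per_region         # store everything
selective_cost = labels.sum() * bits_per_region   # store only important regions

print(full_cost, selective_cost)  # cost drops in proportion to the synthesized area
```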
Practical Full Resolution Learned Lossless Image Compression
We propose the first practical learned lossless image compression system,
L3C, and show that it outperforms the popular engineered codecs, PNG, WebP and
JPEG 2000. At the core of our method is a fully parallelizable hierarchical
probabilistic model for adaptive entropy coding which is optimized end-to-end
for the compression task. In contrast to recent autoregressive discrete
probabilistic models such as PixelCNN, our method i) models the image
distribution jointly with learned auxiliary representations instead of
exclusively modeling the image distribution in RGB space, and ii) only requires
three forward-passes to predict all pixel probabilities instead of one for each
pixel. As a result, L3C obtains over two orders of magnitude speedups when
sampling compared to the fastest PixelCNN variant (Multiscale-PixelCNN).
Furthermore, we find that learning the auxiliary representation is crucial and
outperforms predefined auxiliary representations such as an RGB pyramid
significantly.
Comment: Updated preprocessing and Table 1, see A.1 in supplementary. Code and
models: https://github.com/fab-jul/L3C-PyTorch
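The adaptive entropy coding step rests on a standard fact: given a predicted distribution over pixel values, an ideal entropy coder spends -log2 p(pixel) bits per symbol. The sketch below uses random stand-in distributions, not L3C's learned hierarchical model.

```python
import numpy as np

rng = np.random.default_rng(2)
H, W, K = 4, 4, 256                 # tiny image, 256 gray levels (illustrative)

# Stand-in for the model's predicted per-pixel distributions (each row sums to 1).
logits = rng.normal(size=(H * W, K))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

pixels = rng.integers(0, K, size=H * W)   # the flattened image to encode

# An ideal adaptive entropy coder spends -log2 p(pixel) bits per symbol;
# sharper (more confident and correct) predictions mean fewer total bits.
bits = -np.log2(probs[np.arange(H * W), pixels]).sum()
print(bits / (H * W))                     # bits per pixel
```

In L3C, all of these per-pixel distributions are predicted in a fixed small number of forward passes, which is what makes the speedup over autoregressive models like PixelCNN possible.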
Multi-Realism Image Compression with a Conditional Generator
By optimizing the rate-distortion-realism trade-off, generative compression
approaches produce detailed, realistic images, even at low bit rates, instead
of the blurry reconstructions produced by rate-distortion optimized models.
However, previous methods do not explicitly control how much detail is
synthesized, which fuels a common criticism of these methods: users may worry
that the decoder generates a misleading reconstruction far from the input
image. In this work, we alleviate these concerns by training a decoder that
can bridge the two regimes and navigate the distortion-realism trade-off. From
a single compressed representation, the receiver can choose either a
reconstruction with low mean squared error that stays close to the input, a
realistic reconstruction with high perceptual quality, or anything in
between. With our method, we set a new state-of-the-art in distortion-realism,
pushing the frontier of achievable distortion-realism pairs, i.e., our method
achieves better distortions at high realism and better realism at low
distortion than ever before.
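The receiver-side choice can be sketched as follows. In the paper the decoder itself is conditioned on the trade-off parameter; here, linearly interpolating between two hypothetical endpoint reconstructions is only a stand-in for that mechanism.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(size=(8, 8))                 # original image (stand-in)

# Two hypothetical endpoint outputs decodable from the SAME compressed representation:
x_mse = x + rng.normal(0, 0.01, x.shape)     # low-distortion: close to the input
x_gan = x + rng.normal(0, 0.10, x.shape)     # high-realism: detailed but less faithful

def reconstruct(beta):
    """Receiver-side knob: beta=0 -> low MSE, beta=1 -> high realism, or in between."""
    return (1 - beta) * x_mse + beta * x_gan

mse = lambda a: np.mean((a - x) ** 2)
print(mse(reconstruct(0.0)), mse(reconstruct(1.0)))  # distortion grows with beta
```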
Learning for Video Compression with Hierarchical Quality and Recurrent Enhancement
In this paper, we propose a Hierarchical Learned Video Compression (HLVC)
method with three hierarchical quality layers and a recurrent enhancement
network. The frames in the first layer are compressed by an image compression
method with the highest quality. Using these frames as references, we propose
the Bi-Directional Deep Compression (BDDC) network to compress the second layer
with relatively high quality. Then, the third layer frames are compressed with
the lowest quality, by the proposed Single Motion Deep Compression (SMDC)
network, which adopts a single motion map to estimate the motions of multiple
frames, thus saving bits for motion information. In our deep decoder, we
develop the Weighted Recurrent Quality Enhancement (WRQE) network, which takes
both compressed frames and the bit stream as inputs. In the recurrent cell of
WRQE, the memory and update signal are weighted by quality features to
reasonably leverage multi-frame information for enhancement. In our HLVC
approach, the hierarchical quality benefits the coding efficiency, since the
high quality information facilitates the compression and enhancement of low
quality frames at encoder and decoder sides, respectively. Finally, the
experiments validate that our HLVC approach advances the state-of-the-art of
deep video compression methods, and outperforms the "Low-Delay P (LDP) very
fast" mode of x265 in terms of both PSNR and MS-SSIM. The project page is at
https://github.com/RenYang-home/HLVC.
Comment: Published in CVPR 2020; corrected a minor typo in the footnote of
Table 1; corrected Figure 1
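The three-layer structure can be sketched as an assignment of frames to quality layers within a group of pictures. The GOP size and exact frame positions below are assumptions for illustration; HLVC's actual grouping may differ.

```python
# Hypothetical 10-frame GOP; the exact grouping used by HLVC is an assumption here.
def quality_layer(i, gop=10):
    """Assign frame i to a hierarchical quality layer (1 = highest quality)."""
    if i % gop == 0:            # layer 1: compressed by a high-quality image codec
        return 1
    if i % gop == gop // 2:     # layer 2: BDDC, bi-directional from layer-1 references
        return 2
    return 3                    # layer 3: SMDC, lowest quality, shared motion map

layers = [quality_layer(i) for i in range(10)]
print(layers)  # [1, 3, 3, 3, 3, 2, 3, 3, 3, 3]
```

The point of the hierarchy is that the few high-quality frames serve as references for compressing, and later enhancing, the many low-quality ones.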
Neural Video Compression using GANs for Detail Synthesis and Propagation
We present the first neural video compression method based on generative
adversarial networks (GANs). Our approach significantly outperforms previous
neural and non-neural video compression methods in a user study, setting a new
state-of-the-art in visual quality for neural methods. We show that the GAN
loss is crucial to obtain this high visual quality. Two components make the GAN
loss effective: we i) synthesize detail by conditioning the generator on a
latent extracted from the warped previous reconstruction to then ii) propagate
this detail with high-quality flow. We find that user studies are required to
compare methods, i.e., none of our quantitative metrics were able to predict
all studies. We present the network design choices in detail, and ablate them
with user studies.
Comment: First two authors contributed equally. ECCV camera ready version
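The "propagate this detail with high-quality flow" step amounts to warping the previous reconstruction by a flow field. The sketch below uses nearest-neighbor backward warping on a toy frame; real codecs typically use differentiable bilinear sampling, and the constant flow field is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(4)
H, W = 6, 6
prev = rng.uniform(size=(H, W))   # previous reconstruction (stand-in)
flow = np.ones((H, W, 2))         # hypothetical flow: displace every pixel by (+1, +1)

# Backward warping: each output pixel samples the previous frame at its position
# displaced by the flow (nearest-neighbor lookup, clipped at the borders).
ys, xs = np.mgrid[0:H, 0:W]
src_y = np.clip((ys + flow[..., 0]).round().astype(int), 0, H - 1)
src_x = np.clip((xs + flow[..., 1]).round().astype(int), 0, W - 1)
warped = prev[src_y, src_x]

print(np.allclose(warped[:-1, :-1], prev[1:, 1:]))  # interior shifted as expected
```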